scaling memory-augmented neural network
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both space and time as the amount of memory grows --- limiting their applicability to real-world domains. Here, we present an end-to-end differentiable memory access scheme, which we call Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories. We show that SAM achieves asymptotic lower bounds in space and time complexity, and find that an implementation runs $1,\!000\times$ faster and with $3,\!000\times$ less physical memory than non-sparse models. SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring $100,\!000$s of time steps and memories. As well, we show how our approach can be adapted for models that maintain temporal associations between memories, as with the recently introduced Differentiable Neural Computer.
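To make the idea of a sparse read concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes that a sparse read means restricting a content-based softmax to the K memory rows most similar to the query; the function names `cosine_sims`, `dense_read` and `sparse_read` and the parameter `k` are hypothetical. The sketch still scans all rows to find the top K, so it illustrates the sparsity of the read weights rather than the claimed asymptotic savings, which would need a nearest-neighbour index in place of the linear scan.

```python
import numpy as np


def cosine_sims(memory, query):
    """Cosine similarity between one query vector and every memory row."""
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(query) + 1e-8
    return memory @ query / norms


def dense_read(memory, query):
    """Softmax attention over all N rows (the non-sparse baseline)."""
    sims = cosine_sims(memory, query)
    w = np.exp(sims - sims.max())
    w /= w.sum()
    return w @ memory, w


def sparse_read(memory, query, k=4):
    """Softmax attention restricted to the k most similar rows (assumed scheme)."""
    sims = cosine_sims(memory, query)
    # Indices of the k largest similarities, in no particular order.
    top = np.argpartition(-sims, k - 1)[:k]
    w_top = np.exp(sims[top] - sims[top].max())
    w_top /= w_top.sum()
    # Only k rows contribute to the read vector; all other weights are zero.
    return w_top @ memory[top], top, w_top


rng = np.random.default_rng(0)
memory = rng.normal(size=(1000, 16))   # N = 1000 slots, word size 16
query = rng.normal(size=16)
read_vec, idx, weights = sparse_read(memory, query, k=4)
```

Because only `k` rows receive non-zero weight, a gradient step through this read would touch `k` rows of memory instead of all `N`.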
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both space and time as the amount of memory grows --- limiting their applicability to real-world domains. Here, we present an end-to-end differentiable memory access scheme, which we call Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories. We show that SAM achieves asymptotic lower bounds in space and time complexity, and find that an implementation runs $1,\!000\times$ faster and with $3,\!000\times$ less physical memory than non-sparse models. SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring $100,\!000$s of time steps and memories.
Reviews: Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
The novelty in this work is thin, see e.g. On the other hand, I have not seen such tools used for writeable memories, and even if there is not a conceptual leap in this paper, certainly the practical aspects of making it work are worth reporting. I therefore recommend accepting the paper, but would like to see some changes. The descriptions of the tasks are especially lacking; considering that the code for the tasks from the original NTM paper has not been released, and there is no "standard" version of the tasks, it is crucial that the paper give careful descriptions of the construction of the tasks. I would also ask that the authors commit to releasing the code for their experiments.
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both space and time as the amount of memory grows -- limiting their applicability to real-world domains. Here, we present an end-to-end differentiable memory access scheme, which we call Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories. We show that SAM achieves asymptotic lower bounds in space and time complexity, and find that an implementation runs 1,000× faster and with 3,000× less physical memory than non-sparse models. SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring 100,000s of time steps and memories. As well, we show how our approach can be adapted for models that maintain temporal associations between memories, as with the recently introduced Differentiable Neural Computer.
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
Rae, Jack, Hunt, Jonathan J., Danihelka, Ivo, Harley, Timothy, Senior, Andrew W., Wayne, Gregory, Graves, Alex, Lillicrap, Timothy
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both space and time as the amount of memory grows --- limiting their applicability to real-world domains. Here, we present an end-to-end differentiable memory access scheme, which we call Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories. We show that SAM achieves asymptotic lower bounds in space and time complexity, and find that an implementation runs $1,\!000\times$ faster and with $3,\!000\times$ less physical memory than non-sparse models.
[R][1610.09027] Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes [DeepMind] • /r/MachineLearning
I use episodic memory so there is no write head. The idea is that instead of determining what you want to store and where to store it, you store everything in one summary state. The summary state is written in memory at every time step. The problem is then to learn to retrieve a previous summary state that helps with the current computation. At every time step, the network generates a retrieval key and mask for one state retrieval.
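A rough sketch of the scheme described in this comment, under stated assumptions: the commenter gives no formulas, so the masked squared-distance score, the soft (softmax) retrieval, and the names `EpisodicMemory`, `write` and `retrieve` are all hypothetical choices made for illustration. In a real model the key and mask would be produced by the network at each time step; here they are fixed by hand.

```python
import numpy as np


class EpisodicMemory:
    """Append-only store of summary states; there is no learned write head."""

    def __init__(self):
        self.states = []

    def write(self, summary_state):
        # Called once per time step: every summary state is stored.
        self.states.append(np.asarray(summary_state, dtype=float))

    def retrieve(self, key, mask):
        """Soft retrieval of one state using a key and a per-dimension mask.

        The mask selects which dimensions of the key matter (assumed form:
        masked squared distance followed by a softmax over stored states).
        """
        if not self.states:
            raise ValueError("memory is empty")
        stored = np.stack(self.states)
        dist = ((stored - key) ** 2 * mask).sum(axis=1)
        w = np.exp(-(dist - dist.min()))
        w /= w.sum()
        return w @ stored, w


mem = EpisodicMemory()
for state in ([1.0, 0.0, 5.0], [0.0, 1.0, -5.0], [1.0, 0.1, 9.0]):
    mem.write(state)

# The mask ignores the third dimension, so only the first two are matched.
key = np.array([0.0, 1.0, 0.0])
mask = np.array([1.0, 1.0, 0.0])
retrieved, weights = mem.retrieve(key, mask)
```

With this key and mask, the second stored state matches exactly on the unmasked dimensions and so receives the largest retrieval weight.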